{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# RNNs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this notebook you will learn how to build Recurrent Neural Networks (RNNs) for time series forecasting and sequence classification."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib as mpl\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import os\n",
    "import pandas as pd\n",
    "import sklearn\n",
    "import sys\n",
    "import tensorflow as tf\n",
    "from tensorflow import keras\n",
    "import time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"python\", sys.version)\n",
    "for module in mpl, np, pd, sklearn, tf, keras:\n",
    "    print(module.__name__, module.__version__)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert sys.version_info >= (3, 5) # Python ≥3.5 required\n",
    "assert tf.__version__ >= \"2.0\"    # TensorFlow ≥2.0 required"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise](https://c1.staticflickr.com/9/8101/8553474140_c50cf08708_b.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 1 – Time series forecasting"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.1) Load the data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's start with a simple univariate time series: the daily temperatures in Melbourne from 1981 to 1990 ([source](https://datamarket.com/data/set/2324/daily-minimum-temperatures-in-melbourne-australia-1981-1990))."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps = pd.read_csv(\"datasets/daily-minimum-temperatures-in-me.csv\",\n",
    "                    parse_dates=[0], index_col=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps.plot(figsize=(10,5))\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.2) Prepare the data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A few dates are missing, for example December 31st, 1984:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps.loc[\"1984-12-29\":\"1985-01-02\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's ensure there's one row per day, filling missing values with the previous valid value:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps = temps.asfreq(\"1D\", method=\"ffill\")\n",
    "temps.loc[\"1984-12-29\":\"1985-01-02\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Alternatively, we could have interpolated using `temps.interpolate()`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.3) Add the shifted columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, let's create a function to add lag columns:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def add_lags(series, times):\n",
    "    cols = []\n",
    "    column_index = []\n",
    "    for time in times:\n",
    "        cols.append(series.shift(-time))\n",
    "        lag_fmt = \"t+{time}\" if time > 0 else \"t{time}\" if time < 0 else \"t\"\n",
    "        column_index += [(lag_fmt.format(time=time), col_name)\n",
    "                        for col_name in series.columns]\n",
    "    df = pd.concat(cols, axis=1)\n",
    "    df.columns = pd.MultiIndex.from_tuples(column_index)\n",
    "    return df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will try to predict the temperature in 5 days (t+5) using the temperatures from the last 30 days (t-29 to t):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = add_lags(temps, times=range(-30+1,1)).iloc[30:-5]\n",
    "y = add_lags(temps, times=[5]).iloc[30:-5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: you may want to use `keras.preprocessing.sequence.TimeseriesGenerator` or `tf.data.Dataset.window()` instead."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.4) Split the dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Split this dataset into three periods: training (1981-1986), validation (1987-1988) and testing (1989-1990)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#X_train, y_train = ...\n",
    "#X_valid, y_valid = ...\n",
    "#X_test, y_test = ..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.5) Reshape the inputs for the RNN"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Keras and TensorFlow expect a 3D NumPy array for any sequence. Its shape should be (number of instances, number of time steps, number of features per time step). Since this is a univariate time series, the last dimension is 1. Reshape the input features to get 3D arrays:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#X_train_3D = ...\n",
    "#X_valid_3D = ...\n",
    "#X_test_3D = ..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.6) Build some baseline models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Build some baseline models (at least one) and evaluate them on the validation set, using the Mean Absolute Error (MAE). For example:\n",
    "\n",
    "* a naive model, that just predicts the last known value.\n",
    "* an EMA model that predicts an exponential moving average of the last 48 hours (you can try to find the best span).\n",
    "* a linear model.\n",
    "\n",
    "Optional: plot the predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.7) Build a simple RNN"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using Keras, build a simple 2-layer RNN with 100 neurons per layer, plus a dense layer with a single neuron. Train the model for 200 epochs with a batch size of 200, using Stochastic Gradient Descent with an learning rate of 0.005. Make sure to print the validation loss during training.\n",
    "\n",
    "Hints:\n",
    "\n",
    "* Create a `Sequential` model.\n",
    "* Add two `SimpleRNN` layers, with 100 units each. The first should return sequences but not the second. Indeed, in a Seq2Vec model, the last RNN layer should not return sequences. The first layer should specify the input shape (i.e., the shape of a single input sequence).\n",
    "* Use the MSE as the loss.\n",
    "* Call the model's `compile()` method, passing it an `SGD` instance with `lr=0.005`.\n",
    "* Call the model's `fit()` method, with the inputs and targets, number of epochs, batch size and validation data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#model1 = ..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.8) Plot the history"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Recall that you can simply use `pd.DataFrame(history.history).plot()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.9) Evaluate the model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Evaluate your RNN on the validation set, using the MAE. Try training your model again using the Huber loss and see if you get better performance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def huber_loss(y_true, y_pred, max_grad=1.):\n",
    "    err = tf.abs(y_true - y_pred, name='abs')\n",
    "    mg = tf.constant(max_grad, name='max_grad')\n",
    "    lin = mg * (err - 0.5 * mg)\n",
    "    quad = 0.5 * err * err\n",
    "    return tf.where(err < mg, quad, lin)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.10) Plot the predictions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Make predictions on the validation set and plot them. Compare them to the targets and the baseline predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise solution](https://camo.githubusercontent.com/250388fde3fac9135ead9471733ee28e049f7a37/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f302f30362f46696c6f735f736567756e646f5f6c6f676f5f253238666c69707065642532392e6a7067)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 1 – Solution"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.1) Load the data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's start with a simple univariate time series: the daily temperatures in Melbourne from 1981 to 1990 ([source](https://datamarket.com/data/set/2324/daily-minimum-temperatures-in-melbourne-australia-1981-1990))."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps = pd.read_csv(\"datasets/daily-minimum-temperatures-in-me.csv\",\n",
    "                    parse_dates=[0], index_col=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps.plot(figsize=(10,5))\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.2) Prepare the data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A few dates are missing, for example December 31st, 1984:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps.loc[\"1984-12-29\":\"1985-01-02\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's ensure there's one row per day, filling missing values with the previous valid value:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "temps = temps.asfreq(\"1D\", method=\"ffill\")\n",
    "temps.loc[\"1984-12-29\":\"1985-01-02\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Alternatively, we could have interpolated using `temps.interpolate()`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.3) Add the shifted columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, let's create a function to add lag columns:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def add_lags(series, times):\n",
    "    cols = []\n",
    "    column_index = []\n",
    "    for time in times:\n",
    "        cols.append(series.shift(-time))\n",
    "        lag_fmt = \"t+{time}\" if time > 0 else \"t{time}\" if time < 0 else \"t\"\n",
    "        column_index += [(lag_fmt.format(time=time), col_name)\n",
    "                        for col_name in series.columns]\n",
    "    df = pd.concat(cols, axis=1)\n",
    "    df.columns = pd.MultiIndex.from_tuples(column_index)\n",
    "    return df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "add_lags(temps, times=(-2, -1, 0, +2)).head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will try to predict the temperature in 5 days (t+5) using the temperatures from the last 30 days (t-29 to t):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = add_lags(temps, times=range(-30+1,1)).iloc[30:-5]\n",
    "y = add_lags(temps, times=[5]).iloc[30:-5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.4) Split the dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's split this dataset into three periods: training, validation and testing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_slice = slice(None, \"1986-12-25\")\n",
    "valid_slice = slice(\"1987-01-01\", \"1988-12-25\")\n",
    "test_slice = slice(\"1989-01-01\", None)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train, y_train = X.loc[train_slice], y.loc[train_slice]\n",
    "X_valid, y_valid = X.loc[valid_slice], y.loc[valid_slice]\n",
    "X_test, y_test = X.loc[test_slice], y.loc[test_slice]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.5) Reshape the inputs for the RNN"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's create a function to reshape the multilevel DataFrames to 3D numpy arrays to feed to an RNN:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def multilevel_df_to_ndarray(df):\n",
    "    shape = [-1] + [len(level) for level in df.columns.remove_unused_levels().levels]\n",
    "    return df.values.reshape(shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train_3D = multilevel_df_to_ndarray(X_train)\n",
    "X_valid_3D = multilevel_df_to_ndarray(X_valid)\n",
    "X_test_3D = multilevel_df_to_ndarray(X_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train_3D.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.6) Build some baseline models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's evaluate some basic strategies, to get some baselines:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.metrics import mean_absolute_error"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def naive(X):\n",
    "    return X.iloc[:, -1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_pred_naive = naive(X_valid)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mean_absolute_error(y_valid, y_pred_naive)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def ema(X, span):\n",
    "    return X.T.ewm(span=span).mean().T.iloc[:, -1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_pred_ema = ema(X_valid, span=10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mean_absolute_error(y_valid, y_pred_ema)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.linear_model import LinearRegression"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lin_reg = LinearRegression()\n",
    "lin_reg.fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_pred_linear = lin_reg.predict(X_valid)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mean_absolute_error(y_valid, y_pred_linear)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's plot these predictions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_predictions(*named_predictions, start=None, end=None, **kwargs):\n",
    "    day_range = slice(start, end)\n",
    "    plt.figure(figsize=(10,5))\n",
    "    for name, y_pred in named_predictions:\n",
    "        if hasattr(y_pred, \"values\"):\n",
    "            y_pred = y_pred.values\n",
    "        plt.plot(y_pred[day_range], label=name, **kwargs)\n",
    "    plt.legend()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_predictions((\"Target\", y_valid),\n",
    "                 (\"Naive\", y_pred_naive),\n",
    "                 (\"EMA\", y_pred_ema),\n",
    "                 (\"Linear\", y_pred_linear),\n",
    "                 end=365)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.7) Build a simple RNN"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's create a simple 2-layer RNN with 100 neurons per layer, plus a dense layer with a single neuron:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "input_shape = X_train_3D.shape[1:]\n",
    "input_shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model1 = keras.models.Sequential()\n",
    "model1.add(keras.layers.SimpleRNN(100, return_sequences=True, input_shape=input_shape))\n",
    "model1.add(keras.layers.SimpleRNN(50))\n",
    "model1.add(keras.layers.Dense(1))\n",
    "model1.compile(loss=\"mse\", optimizer=keras.optimizers.SGD(lr=0.005), metrics=[\"mae\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "history1 = model1.fit(X_train_3D, y_train, epochs=200, batch_size=200,\n",
    "                      validation_data=(X_valid_3D, y_valid))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.8) Plot the history"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_history(history, loss=\"loss\"):\n",
    "    train_losses = history.history[loss]\n",
    "    valid_losses = history.history[\"val_\" + loss]\n",
    "    n_epochs = len(history.epoch)\n",
    "    minloss = np.min(valid_losses)\n",
    "    \n",
    "    plt.plot(train_losses, color=\"b\", label=\"Train\")\n",
    "    plt.plot(valid_losses, color=\"r\", label=\"Validation\")\n",
    "    plt.plot([0, n_epochs], [minloss, minloss], \"k--\",\n",
    "             label=\"Min val: {:.2f}\".format(minloss))\n",
    "    plt.axis([0, n_epochs, 0, 20])\n",
    "    plt.legend()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_history(history1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.9) Evaluate the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model1.evaluate(X_valid_3D, y_valid)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def huber_loss(y_true, y_pred, max_grad=1.):\n",
    "    err = tf.abs(y_true - y_pred, name='abs')\n",
    "    mg = tf.constant(max_grad, name='max_grad')\n",
    "    lin = mg * (err - 0.5 * mg)\n",
    "    quad = 0.5 * err * err\n",
    "    return tf.where(err < mg, quad, lin)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model1 = keras.models.Sequential()\n",
    "model1.add(keras.layers.SimpleRNN(100, return_sequences=True, input_shape=input_shape))\n",
    "model1.add(keras.layers.SimpleRNN(100))\n",
    "model1.add(keras.layers.Dense(1))\n",
    "model1.compile(loss=huber_loss, optimizer=keras.optimizers.SGD(lr=0.005), metrics=[\"mae\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "history1 = model1.fit(X_train_3D, y_train, epochs=200, batch_size=200,\n",
    "                      validation_data=(X_valid_3D, y_valid))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model1.evaluate(X_valid_3D, y_valid)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.10) Plot the predictions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_pred_rnn1 = model1.predict(X_valid_3D)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "plot_predictions((\"Target\", y_valid),\n",
    "                 (\"Linear\", y_pred_linear),\n",
    "                 (\"RNN\", y_pred_rnn1),\n",
    "                 end=365)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise](https://c1.staticflickr.com/9/8101/8553474140_c50cf08708_b.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 2 – Forecasting the shifted sequence (Seq2Seq)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's predict temperatures for 30 days (from t-24 to t+5) instead of just one."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.1) Define the 3D targets for training, validation and testing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Y_train_3D = ...\n",
    "#Y_valid_3D = ...\n",
    "#Y_test_3D = ..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.2) Define an `mae_last_step()` function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the final evaluation, we only want to look at the final time step (t+5). Create an `mae_last_step()` function that computes the MAE based on the final time step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.3) Build a Seq2Seq model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Build a Seq2Seq model and compile it, using the Huber Loss, and using the last step MAE as the metric. Use SGD with a learning rate of 0.01. Hint: the layers are the same as earlier, except that the last RNN layer has `return_sequences=False`, and the `Dense` layer must be wrapped in a `keras.layers.TimeDistributed` layer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.4) Train the model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Fit the model as earlier (but with the 3D targets). Again, evaluate the model and plot the predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise solution](https://camo.githubusercontent.com/250388fde3fac9135ead9471733ee28e049f7a37/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f302f30362f46696c6f735f736567756e646f5f6c6f676f5f253238666c69707065642532392e6a7067)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 2 – Solution"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.1) Define the 3D targets for training, validation and testing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Y = add_lags(temps, times=range(-24, 5+1)).iloc[30:-5]\n",
    "Y_train = Y.loc[train_slice]\n",
    "Y_valid = Y.loc[valid_slice]\n",
    "Y_test = Y.loc[test_slice]\n",
    "Y_train_3D = multilevel_df_to_ndarray(Y_train)\n",
    "Y_valid_3D = multilevel_df_to_ndarray(Y_valid)\n",
    "Y_test_3D = multilevel_df_to_ndarray(Y_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.2) Define an `mae_last_step()` function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the final evaluation, we only want to look at the final time step (t+5):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "K = keras.backend\n",
    "\n",
    "def mae_last_step(Y_true, Y_pred):\n",
    "    return K.mean(K.abs(Y_pred[:, -1] - Y_true[:, -1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.3) Build a Seq2Seq model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model2 = keras.models.Sequential()\n",
    "model2.add(keras.layers.SimpleRNN(100, return_sequences=True, input_shape=input_shape))\n",
    "model2.add(keras.layers.SimpleRNN(100, return_sequences=True))\n",
    "model2.add(keras.layers.TimeDistributed(keras.layers.Dense(1)))\n",
    "model2.compile(loss=huber_loss, optimizer=keras.optimizers.SGD(lr=0.01),\n",
    "               metrics=[mae_last_step])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.4) Train the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "history2 = model2.fit(X_train_3D, Y_train_3D, epochs=200, batch_size=200,\n",
    "                      validation_data=(X_valid_3D, Y_valid_3D))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_history(history2, loss=\"mae_last_step\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model2.evaluate(X_valid_3D, Y_valid_3D)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_pred_rnn2 = model2.predict(X_valid_3D)[:, -1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_predictions((\"Target\", y_valid),\n",
    "                 (\"Linear\", y_pred_linear),\n",
    "                 (\"RNN\", y_pred_rnn2),\n",
    "                 end=365)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise](https://c1.staticflickr.com/9/8101/8553474140_c50cf08708_b.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 3 – LSTM and GRU"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.1) Build, train and evaluate a Seq2Seq LSTM"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Train the same model as earlier but using `LSTM` or `GRU` instead of `SimpleRNN`. You can also try reducing the learning rate when the validation loss reaches a plateau, using the `ReduceLROnPlateau` callback."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.2) Add $\\ell_2$ regularization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add $\\ell_2$ regularization to your RNN, using the layers' `kernel_regularizer` and `recurrent_regularizer` arguments, and the `l2()` function in `keras.regularizers`. Tip: use the `partial()` function in the `functools` package to avoid repeating the same arguments again and again."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise solution](https://camo.githubusercontent.com/250388fde3fac9135ead9471733ee28e049f7a37/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f302f30362f46696c6f735f736567756e646f5f6c6f676f5f253238666c69707065642532392e6a7067)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 3 – Solution"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.1) Build, train and evaluate a Seq2Seq LSTM"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can try replacing `LSTM` with `GRU`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model3 = keras.models.Sequential()\n",
    "model3.add(keras.layers.LSTM(100, return_sequences=True, input_shape=input_shape))\n",
    "model3.add(keras.layers.LSTM(100, return_sequences=True))\n",
    "model3.add(keras.layers.TimeDistributed(keras.layers.Dense(1)))\n",
    "model3.compile(loss=huber_loss, optimizer=keras.optimizers.SGD(lr=0.01),\n",
    "               metrics=[mae_last_step])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "history3 = model3.fit(X_train_3D, Y_train_3D, epochs=200, batch_size=200,\n",
    "                      validation_data=(X_valid_3D, Y_valid_3D),\n",
    "                      callbacks=[keras.callbacks.ReduceLROnPlateau(verbose=1)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model3.evaluate(X_valid_3D, Y_valid_3D)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "plot_history(history3, loss=\"mae_last_step\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_pred_rnn3 = model3.predict(X_valid_3D)[:, -1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_predictions((\"Target\", y_valid),\n",
    "                 (\"Linear\", y_pred_linear),\n",
    "                 (\"RNN\", y_pred_rnn3),\n",
    "                 end=365)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.2) Add $\\ell_2$ regularization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import partial"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "RegularizedLSTM = partial(keras.layers.LSTM,\n",
    "                          return_sequences=True,\n",
    "                          kernel_regularizer=keras.regularizers.l2(1e-4),\n",
    "                          recurrent_regularizer=keras.regularizers.l2(1e-4))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model3 = keras.models.Sequential()\n",
    "model3.add(RegularizedLSTM(100, input_shape=input_shape))\n",
    "model3.add(RegularizedLSTM(100))\n",
    "model3.add(keras.layers.Dense(1))\n",
    "model3.compile(loss=huber_loss, optimizer=keras.optimizers.SGD(lr=0.01),\n",
    "               metrics=[mae_last_step])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "history3 = model3.fit(X_train_3D, Y_train_3D, epochs=200, batch_size=100,\n",
    "                      validation_data=(X_valid_3D, Y_valid_3D))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model3.evaluate(X_valid_3D, Y_valid_3D)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_history(history3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_pred_rnn3 = model3.predict(X_valid_3D)[:, -1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_predictions((\"Target\", y_valid),\n",
    "                 (\"Linear\", y_pred_linear),\n",
    "                 (\"RNN\", y_pred_rnn3),\n",
    "                 end=365)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise](https://c1.staticflickr.com/9/8101/8553474140_c50cf08708_b.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 4 – Preprocessing with 1D-ConvNets"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At the beginning of your sequential model, add a `Conv1D` layer with 32 kernels of size 5, a `MaxPool1D` layer with pool size 5 and strides 2. Train and evaluate the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise solution](https://camo.githubusercontent.com/250388fde3fac9135ead9471733ee28e049f7a37/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f302f30362f46696c6f735f736567756e646f5f6c6f676f5f253238666c69707065642532392e6a7067)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 4 – Solution"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model4 = keras.models.Sequential()\n",
    "model4.add(keras.layers.Conv1D(32, kernel_size=5, input_shape=input_shape))\n",
    "model4.add(keras.layers.MaxPool1D(pool_size=5, strides=2))\n",
    "model4.add(keras.layers.LSTM(32, return_sequences=True))\n",
    "model4.add(keras.layers.LSTM(32))\n",
    "model4.add(keras.layers.Dense(1))\n",
    "model4.compile(loss=huber_loss, optimizer=keras.optimizers.SGD(lr=0.005))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model4.summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "history4 = model4.fit(X_train_3D, y_train, epochs=200, batch_size=100,\n",
    "                      validation_data=(X_valid_3D, y_valid))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model4.evaluate(X_valid_3D, y_valid)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise](https://c1.staticflickr.com/9/8101/8553474140_c50cf08708_b.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercice 5 – Sequence classification"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's load the IMDB movie reviews, for binary sentiment analysis (positive review or negative review):"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We only want the 10,000 most common words:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_words = 10000\n",
    "(X_train, y_train), (X_test, y_test) = keras.datasets.imdb.load_data(num_words=num_words)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's also get the word index (word to word id):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "word_index = keras.datasets.imdb.get_word_index()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And let's create a reverse index (word id to word). Three special word ids are added:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "reverse_index = {word_id + 3: word for word, word_id in word_index.items()}\n",
    "reverse_index[0] = \"<pad>\" # padding\n",
    "reverse_index[1] = \"<sos>\" # start of sequence\n",
    "reverse_index[2] = \"<oov>\" # out-of-vocabulary\n",
    "reverse_index[3] = \"<unk>\" # unknown"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's write a little function to decode reviews:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def decode_review(word_ids):\n",
    "    return \" \".join([reverse_index.get(word_id, \"<err>\") for word_id in word_ids])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's look at a review:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "decode_review(X_train[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It seems very positive, let's look at the target (0=negative review, 1=positive review):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_train[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And another review:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "decode_review(X_train[1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Very negative! Let's check the target:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_train[1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5.1) Train a baseline model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Train and evaluate a baseline model using ScikitLearn. You will need to create a pipeline with a `CountVectorizer`, a `TfidfTransformer` and an `SGDClassifier`. The `CountVectorizer` transformer expects text as input, so let's create a text version of the training set and test set:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train_text = [decode_review(words_ids) for words_ids in X_train]\n",
    "X_test_text = [decode_review(words_ids) for words_ids in X_test]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer\n",
    "from sklearn.pipeline import Pipeline\n",
    "from sklearn.linear_model import SGDClassifier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5.2) Create a sequence classifier"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a sequence classifier using Keras:\n",
    "* Use `keras.preprocessing.sequence.pad_sequences()` to preprocess `X_train`: this will create a 2D array of 25,000 rows (one per review) and `maxlen=500` columns. Reviews longer than 500 words will be cropped, while reviews shorter \n",
    "than 500 words will be padded with zeros.\n",
    "* The first layer in your model should be an `Embedding` layer, with `input_dim=num_words` and `output_dim=10`. The model will gradually learn to represent each of the 10,000 words as a 10-dimensional vector. So the next layer will receive 3D batchs of shape (batch size, 500, 10).\n",
    "* Add one or more LSTM layers with 32 neurons each.\n",
    "* The output layer should be a Dense layer with a sigmoid activation function, since this is a binary classification problem.\n",
    "* When compiling the model, you should use the `binary_crossentropy` loss.\n",
    "* Fit the model for 10 epochs, using a batch size of 128 and `validation_split=0.2`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise solution](https://camo.githubusercontent.com/250388fde3fac9135ead9471733ee28e049f7a37/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f302f30362f46696c6f735f736567756e646f5f6c6f676f5f253238666c69707065642532392e6a7067)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercice 5 – Solution"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5.1) Train a baseline model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train_text = [decode_review(words_ids) for words_ids in X_train]\n",
    "X_test_text = [decode_review(words_ids) for words_ids in X_test]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer\n",
    "from sklearn.pipeline import Pipeline\n",
    "from sklearn.linear_model import SGDClassifier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pipeline = Pipeline([\n",
    "    ('vect', CountVectorizer()),\n",
    "    ('tfidf', TfidfTransformer()),\n",
    "    ('clf', SGDClassifier(max_iter=50)),\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pipeline.fit(X_train_text, y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pipeline.score(X_test_text, y_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We get 88.5% accuracy, that's not too bad. But don't forget to check the ratio of positive reviews:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_test.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's try our model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pipeline.predict([\"this movie was really awesome\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5.2) Create a sequence classifier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "maxlen = 500\n",
    "X_train_trim = keras.preprocessing.sequence.pad_sequences(X_train, maxlen=maxlen)\n",
    "X_test_trim = keras.preprocessing.sequence.pad_sequences(X_test, maxlen=maxlen)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = keras.models.Sequential()\n",
    "model.add(keras.layers.Embedding(input_dim=num_words, output_dim=10))\n",
    "model.add(keras.layers.LSTM(32))\n",
    "model.add(keras.layers.Dense(1, activation=\"sigmoid\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.compile(loss=\"binary_crossentropy\", optimizer=\"rmsprop\", metrics=[\"accuracy\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "history = model.fit(X_train_trim, y_train,\n",
    "                    epochs=10, batch_size=128, validation_split=0.2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.evaluate(X_test_trim, y_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise](https://c1.staticflickr.com/9/8101/8553474140_c50cf08708_b.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 6 – Bidirectional RNN"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Update the previous sequence classification model to use a bidirectional LSTM. For this, you just need to wrap the LSTM layer in a `Bidirectional` layer. If the model overfits, try adding a dropout layer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Exercise solution](https://camo.githubusercontent.com/250388fde3fac9135ead9471733ee28e049f7a37/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f302f30362f46696c6f735f736567756e646f5f6c6f676f5f253238666c69707065642532392e6a7067)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 6 – Solution"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = keras.models.Sequential()\n",
    "model.add(keras.layers.Embedding(input_dim=num_words, output_dim=10))\n",
    "model.add(keras.layers.Dropout(0.5))\n",
    "model.add(keras.layers.Bidirectional(keras.layers.LSTM(32)))\n",
    "model.add(keras.layers.Dense(1, activation=\"sigmoid\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.compile(loss=\"binary_crossentropy\", optimizer=\"rmsprop\", metrics=[\"accuracy\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "history = model.fit(X_train_trim, y_train,\n",
    "                    epochs=10, batch_size=128, validation_split=0.2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "model.evaluate(X_test_trim, y_test)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
